Smart Paper Notes

Towards Asteroid Detection in Microlensing Surveys with Deep Learning

Preeti Cowan , Ian A. Bond , Napoleon H. Reyes

Categories: Computer Vision | Machine Learning

2022-11-04

Asteroids are an indelible part of most astronomical surveys though only a few surveys are dedicated to their detection. Over the years, high cadence microlensing surveys have amassed several terabytes of data while scanning primarily the Galactic Bulge and Magellanic Clouds for microlensing events and thus provide a treasure trove of opportunities for scientific data mining. In particular, numerous asteroids have been observed by visual inspection of selected images. This paper presents novel deep learning-based solutions for the recovery and discovery of asteroids in the microlensing data gathered by the MOA project. Asteroid tracklets can be clearly seen by combining all the observations on a given night and these tracklets inform the structure of the dataset. Known asteroids were identified within these composite images and used for creating the labelled datasets required for supervised learning. Several custom CNN models were developed to identify images with asteroid tracklets. Model ensembling was then employed to reduce the variance in the predictions as well as to improve the generalisation error, achieving a recall of 97.67%. Furthermore, the YOLOv4 object detector was trained to localize asteroid tracklets, achieving a mean Average Precision (mAP) of 90.97%. These trained networks will be applied to 16 years of MOA archival data to find both known and unknown asteroids that have been observed by the survey over the years. The methodologies developed can be adapted for use by other surveys for asteroid recovery and discovery.
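The ensembling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes soft voting (averaging each model's per-image probability), which is one common ensembling scheme; the abstract does not say which scheme the authors used. The function names, the probabilities, and the 0.5 threshold are all hypothetical.

```python
from statistics import mean


def ensemble_probabilities(per_model_probs):
    """Average per-image probabilities across models (soft voting)."""
    return [mean(image_probs) for image_probs in zip(*per_model_probs)]


def recall(y_true, y_pred):
    """Fraction of true positives that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical outputs of three classifiers for four composite images
# (1 = image contains an asteroid tracklet).
model_probs = [
    [0.9, 0.4, 0.2, 0.7],
    [0.8, 0.6, 0.1, 0.4],
    [0.7, 0.7, 0.3, 0.6],
]
labels = [1, 1, 0, 1]

avg = ensemble_probabilities(model_probs)
preds = [1 if p >= 0.5 else 0 for p in avg]
single_model_preds = [1 if p >= 0.5 else 0 for p in model_probs[1]]

print("ensemble recall:", recall(labels, preds))
print("single-model recall:", recall(labels, single_model_preds))
```

In this toy data the second model misses one tracklet on its own, while the averaged prediction recovers it, which is the variance-reduction effect the abstract refers to.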

Translated by Google Translate

A perspective on physical reservoir computing with nanomagnetic devices

Dan A Allwood , Matthew O A Ellis , David Griffin , Thomas J Hayward , Luca Manneschi , Mohammad F KH Musameh , Simon O'Keefe , Susan Stepney , Charles Swindells , Martin A Trefzer

Categories: Machine Learning

2022-12-09

Neural networks have revolutionized the area of artificial intelligence and introduced transformative applications to almost every scientific field and industry. However, this success comes at a great price; the energy requirements for training advanced models are unsustainable. One promising way to address this pressing issue is by developing low-energy neuromorphic hardware that directly supports the algorithm's requirements. The intrinsic non-volatility, non-linearity, and memory of spintronic devices make them appealing candidates for neuromorphic devices. Here we focus on the reservoir computing paradigm, a recurrent network with a simple training algorithm suitable for computation with spintronic devices since they can provide the properties of non-linearity and memory. We review technologies and methods for developing neuromorphic spintronic devices and conclude with critical open issues to address before such devices become widely used.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

COMPASS: A Formal Framework and Aggregate Dataset for Generalized Surgical Procedure Modeling

Kay Hutchinson , Ian Reyes , Zongyu Li , Homa Alemzadeh

Categories: Robotics

2022-09-14

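Purpose: We propose a formal framework for modeling surgical tasks using a unified set of motion primitives (MPs) as the basic surgical actions, to enable more objective labeling and aggregation of different datasets and the training of generalized models for surgical action recognition. Methods: We use our framework to create the COntext and Motion Primitive Aggregate Surgical Set (COMPASS), which includes labels for six dry-lab surgical tasks from three publicly available datasets (JIGSAWS, DESK, and ROSMA). Methods for labeling surgical context and for automatically translating it to MPs are presented. We propose the Leave-One-Task-Out (LOTO) cross-validation method to evaluate a model's ability to generalize to an unseen task. Results: Our context labeling method achieves near-perfect agreement between consensus labels from crowd-sourcing and expert surgeons. Segmentation of tasks into MPs enables the generation of separate left and right transcripts and significantly improves LOTO performance. We find that MP segmentation models perform best when trained on tasks with the same context and/or tasks from the same dataset. Conclusion: The proposed framework enables high-quality labeling of surgical data based on context and fine-grained MPs. Modeling surgical tasks with MPs enables the aggregation of different datasets for training action recognition models that generalize better to unseen tasks than models trained at the gesture level. Significance: Our formal framework and aggregate dataset can support the development of models and algorithms for surgical process analysis, skill assessment, error detection, and autonomy.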
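The Leave-One-Task-Out (LOTO) idea can be sketched as a fold generator: each fold holds out every trial of one task for testing and trains on the rest. This is a minimal illustration, not the authors' code; the function name, the `(task, trial)` tuple layout, and the task names are assumptions.

```python
def leave_one_task_out(samples):
    """Yield (held_out_task, train, test), holding out one task per fold.

    `samples` is a list of (task_name, trial_id) tuples (hypothetical layout).
    """
    tasks = sorted({task for task, _ in samples})
    for held_out in tasks:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Hypothetical trials from three surgical tasks.
samples = [
    ("suturing", "trial_1"),
    ("suturing", "trial_2"),
    ("knot_tying", "trial_1"),
    ("peg_transfer", "trial_1"),
]

folds = list(leave_one_task_out(samples))
for held_out, train, test in folds:
    print(held_out, "train:", len(train), "test:", len(test))
```

Because the held-out task never appears in the training split, the test score measures generalization to an unseen task rather than to unseen trials of a familiar one.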

Profiling Television Watching Behaviour Using Bayesian Hierarchical Joint Models for Time-to-Event and Count Data

Rafael A. Moral , Zhi Chen , Shuai Zhang , Sally McClean , Gabriel R. Palma , Brahim Allan , Ian Kegel

Categories: Machine Learning (Statistics)

2022-09-06

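Customer churn prediction is a valuable task in many industries. In telecommunications it is especially challenging, given the high dimensionality of the data and how difficult it is to identify underlying frustration signatures, which may represent an important driver of future churn behaviour. Here, we propose a novel Bayesian hierarchical joint model that is able to characterise customer profiles based on the events that take place within different television-watching journeys and how long it takes between events. The model drastically reduces the dimensionality of the data, from thousands of observations per customer to 11 customer-level parameter estimates and random effects. We test our methodology using data from 40 BT customers (20 active and 20 who eventually cancelled their subscription) whose TV-watching behaviour was recorded from October to December 2019, totalling approximately half a million observations. Employing different machine learning techniques with the parameter estimates and random effects from the Bayesian hierarchical model as features yielded up to 92% accuracy in predicting churn, associated with a 100% true positive rate and a 14% false positive rate on a validation set. Our proposed methodology is an efficient way of reducing the dimensionality of the data while maintaining high descriptive and predictive capabilities. We provide code to implement the Bayesian model at https://github.com/rafamoral/profiling_tv_watching_behaviour.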

Masked Sinogram Model with Transformer for ill-Posed Computed Tomography Reconstruction: a Preliminary Study

Zhengchun Liu , Rajkumar Kettimuthu , Ian Foster

Categories: Computer Vision | Machine Learning

2022-09-03

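Computed Tomography (CT) is an imaging technique in which information about an object is collected at different angles (called projections or scans). A cross-sectional image showing the internal structure of the slice is then produced by solving an inverse problem. Limited by factors such as radiation dosage and the number of projection angles, the produced images can be noisy or contain artifacts. Inspired by the success of the Transformer in natural language processing, the core idea of this preliminary study is to treat a tomographic projection as a word token, and the whole scan of the cross-section (a.k.a. the sinogram) as a sentence in the context of natural language processing. We then explore the idea of a foundation model by training a masked sinogram model (MSM) and fine-tuning it for various downstream applications, including CT reconstruction under data collection restrictions (e.g., photon budget) and a data-driven solution that approximates solutions of the inverse problem for CT reconstruction. The models and data used in this study are available at https://github.com/lzhengchun/tomotx.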
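The "projection as token" analogy can be made concrete with a small sketch of the masking step: a sinogram is stored as one row per projection angle, and a random subset of rows is hidden so that a model can be trained to reconstruct them. This is only an illustration under stated assumptions; the function name, the use of zeros as the mask value, and the mask ratio are hypothetical, and the actual model may use a learned mask embedding instead.

```python
import random


def mask_projections(sinogram, mask_ratio, seed=0):
    """Zero out a random subset of projections (rows) of a sinogram.

    Returns the masked copy and the sorted indices of the masked rows.
    """
    rng = random.Random(seed)
    n_angles = len(sinogram)
    n_masked = int(round(n_angles * mask_ratio))
    masked_idx = sorted(rng.sample(range(n_angles), n_masked))
    masked_set = set(masked_idx)
    masked = [
        [0.0] * len(row) if i in masked_set else list(row)
        for i, row in enumerate(sinogram)
    ]
    return masked, masked_idx


# Toy sinogram: 8 projection angles x 4 detector pixels (made-up values).
sinogram = [[float(a * 10 + d + 1) for d in range(4)] for a in range(8)]

masked, idx = mask_projections(sinogram, mask_ratio=0.25)
print("masked projection indices:", idx)
```

During pre-training, the reconstruction loss would typically be computed only on the rows listed in `idx`, in the same way masked language models are scored only on the masked tokens.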

A Failure Identification and Recovery Framework for a Planar Reconfigurable Cable Driven Parallel Robot

Adhiti Raman , Ian Walker , Venkat Krovi , Matthias Schmid

Categories: Robotics

2022-09-02

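In cable-driven parallel robots (CDPRs), a single cable malfunction usually causes complete failure of the entire robot. However, the static workspace lost to the failure can often be recovered by reconfiguring the cable attachment points on the frame. This capability is introduced by adding kinematic redundancy to the robot in the form of moving linear sliders that are manipulated by a real-time redundancy resolution controller. The presented work combines this controller with an online failure detection framework to develop a complete fault-tolerant control scheme for automatic task recovery. The solution provides robustness by combining pose estimation of the end-effector with failure detection through an Interactive Multiple Model (IMM) algorithm that relies only on end-effector information. The failure and pose estimation scheme is then tied into the redundancy resolution approach to produce a seamless automatic task (trajectory) recovery approach for cable failures.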

A Model of Anaphoric Ambiguities using Sheaf Theoretic Quantum-like Contextuality and BERT

Kin Ian Lo , Mehrnoosh Sadrzadeh , Shane Mansfield

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning | Neural and Evolutionary Computing

2022-08-11

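Ambiguities of natural language do not prevent us from using it, and context helps in getting ideas across. Nonetheless, they pose a key challenge to the development of competent machines that understand natural language and use it as humans do. Contextuality is an unparalleled phenomenon in quantum mechanics, where different mathematical formalisms have been put forward to understand and reason about it. In this paper, we construct a schema for anaphoric ambiguities that exhibits quantum-like contextuality. We use a recently developed criterion of sheaf-theoretic contextuality that is applicable to signalling models. We then take advantage of the neural word embedding engine BERT to instantiate the schema to natural language examples and extract probability distributions for the instances. As a result, plenty of sheaf-contextual examples were discovered in the natural language corpora BERT utilises. Our hope is that these examples will pave the way for future research and for finding ways to extend applications of quantum computing to natural language processing.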

A Quantum Natural Language Processing Approach to Pronoun Resolution

Hadi Wazni , Kin Ian Lo , Lachlan McPheat , Mehrnoosh Sadrzadeh

Categories: Natural Language Processing

2022-08-10

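We use the Lambek Calculus with soft sub-exponential modalities to model and reason about discourse relations such as anaphora and ellipsis. A semantics for this logic is obtained by using truncated Fock spaces, developed in our previous work. We depict these semantic computations via a new string diagram. The Fock space semantics has the advantage that its terms can be learned from large corpora of data using machine learning, and can be experimented with on mainstream natural language tasks. Further, thanks to an existing translation from vector spaces to quantum circuits, we can also learn these terms on quantum computers and their simulators, such as the IBMQ range. We extend the existing translation to Fock spaces and develop quantum circuit semantics for discourse relations. We then experiment with IBMQ simulations of these circuits on a definite pronoun resolution task, where the highest accuracies were recorded for models in which the anaphora was resolved.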

S4: a High-sparsity, High-performance AI Accelerator

Ian En-Hsu Yen , Zhibin Xiao , Dongkuan Xu

Categories: Machine Learning

2022-07-16

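Exploiting the sparsity underlying neural networks has become one of the most promising approaches to reducing memory footprint, I/O cost, and computation workload. The degree of sparsity that can be exploited has also grown as larger model sizes have been considered, along with the trend of pre-training giant models. On the other hand, compared with options that are already widely supported, acceleration through high-degree sparsity is not supported on most computing platforms. In this work, we introduce S4, the first commercial hardware platform supporting high-degree sparsity acceleration, up to 32 times. Combined with state-of-the-art sparse pruning techniques, we demonstrate a practical inference speedup of several times on S4 over mainstream inference platforms such as the NVIDIA T4. We also show that, in practice, a larger sparse model can achieve both higher accuracy and higher throughput on S4 than a smaller dense model.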
